
Introduction

Since the advent of big data, Major League Baseball (MLB) has proven ripe for analysis because of the many factors and intricacies within the game, best exemplified by its advanced metrics. In this report, we investigate MLB pitcher Corbin Burnes and his evolution as a pitcher over the five seasons from 2019 to 2023, looking at some of the advanced metrics behind his pitches. We selected Burnes because his career, especially over these five seasons, has been very interesting. Burnes made his MLB debut in 2018 and struggled mightily in 2019 before he found success and dominance in 2020, finishing 6th in voting for the National League (NL) Cy Young Award, the highest honor a pitcher can achieve. His dominance continued in 2021 and beyond, with our interval for this report covering his 2019 through 2023 seasons.

Through analysis of Burnes’ pitches, we plan to build a function that labels each pitch “good” or “bad” based on run expectancy, which is the number of runs an average team is expected to score over the remainder of an inning from a given situation. A pitch that lowers run expectancy credits the pitcher for making a run less likely in that inning. With this function, we can make inferences about Burnes’ performance as a pitcher, which revolves around his ability to prevent runs.

Our analysis of his performance rests on the metrics we use to evaluate it. We chose four advanced metrics: spin rate, batting average, wOBA, and effective speed. Spin rate is measured in revolutions per minute (RPM), with breaking pitches like curveballs and sliders having the most spin. Fastballs and their variations, like cutters and sinkers, also have spin, just less so than breaking balls. Generally, the most effective pitches of any type have a lot of spin or break on them. Batting average is the rate at which the pitcher allows a hit of any kind, whether it is a single, double, triple, or home run.

wOBA (weighted on-base average) is a variation of on-base percentage, which is the rate at which a pitcher allows the hitter to reach base, whether by a hit, a walk, or a hit-by-pitch. The difference is that wOBA accounts for how a player reaches base, giving more weight to extra-base hits. Effective speed is Statcast’s perceived velocity: the pitch speed adjusted for how far in front of the rubber the pitcher releases the ball. Fastballs and their variations are generally faster than breaking pitches. With these metrics, we can delve into how Burnes may have achieved his success from 2020-2023 by comparing his metrics from that span to his 2019 season. Additionally, with our “good”/“bad” pitch function, we will be able to see what his metrics were on the pitches that achieved the most and least success, giving us an idea of why he was and was not successful in those instances.

Exploratory Data Analysis

First, we create our web-scraping function to gather all of the necessary data from MLB Statcast.

All Pitches by Burnes (2019-2023)

Displayed is a small tibble of Corbin Burnes’ pitches in every game he played from 2019 to 2023, along with the release speed and run expectancy of each pitch.

Burnes’ Average Spin Rate, Estimated wOBA, Estimated Batting Average, and Effective Speed Per Game

This data frame displays Corbin Burnes’ average effective speed, estimated wOBA, estimated batting average, and average spin rate for every game he pitched, along with the year of each game.

Burnes’ Total Average Spin Rate, Estimated wOBA, Estimated Batting Average, and Effective Speed Per Year

For every year between 2019 and 2023, we obtain the total averages of Corbin Burnes’ average spin rate, estimated wOBA, estimated batting average, and effective speed. The total averages are displayed in the 5x5 tibble.
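The per-game and per-year summaries described above can be sketched in base R as follows. This is a minimal illustration, not the report's own code: the rows are made-up values, the column names follow the Statcast export (`effective_speed`, `release_spin_rate`, `estimated_ba_using_speedangle`, `estimated_woba_using_speedangle`), and the yearly figures are assumed to be averages over all pitches in a season rather than averages of the per-game averages.

```r
# Toy stand-in for the scraped Statcast pitch table (all values are made up).
pitches <- data.frame(
  game_date = as.Date(c("2019-04-01", "2019-04-01", "2019-04-06",
                        "2020-07-25", "2020-07-25", "2020-07-31")),
  effective_speed = c(95.1, 86.3, 94.0, 95.8, 96.2, 88.4),
  release_spin_rate = c(2650, 2900, 2700, 2750, 2800, 2950),
  estimated_ba_using_speedangle = c(0.310, NA, 0.450, 0.120, NA, 0.200),
  estimated_woba_using_speedangle = c(0.350, NA, 0.520, 0.110, NA, 0.240)
)
pitches$game_year <- as.integer(format(pitches$game_date, "%Y"))

metrics <- c("effective_speed", "release_spin_rate",
             "estimated_ba_using_speedangle",
             "estimated_woba_using_speedangle")

# Pitches that were not put in play have no estimated BA/wOBA, so skip NAs.
mean_or_na <- function(x) mean(x, na.rm = TRUE)

# One row per game: average of each metric over the pitches in that game.
per_game <- aggregate(pitches[metrics],
                      by = list(game_year = pitches$game_year,
                                game_date = pitches$game_date),
                      FUN = mean_or_na)

# One row per season: average of each metric over all pitches that year.
per_year <- aggregate(pitches[metrics],
                      by = list(game_year = pitches$game_year),
                      FUN = mean_or_na)

print(per_game)
print(per_year)
```

With five seasons of real data, `per_year` would have five rows and five columns (the year plus four metrics), matching the 5x5 shape described above.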

Visualizing Burnes’ Pitching Progress and Prowess

Graph of Average Pitching Stats from 2019-2023

Four graphs are displayed, one for each metric discussed in the previous table. Effective speed is presented in miles per hour (MPH), wOBA and batting average are presented as a percentage, and spin rate is presented in revolutions per minute (RPM), the unit Statcast reports. These metrics are tracked over each of the five years.

Of note, effective speed had the highest variance in 2019, possibly because Burnes threw more pitches of different types and speeds that year. Over the next four years, the lines are much flatter than in 2019, indicating that Burnes did not mix his pitches as much.

For estimated batting average, the graph tracks opponents’ batting average against Burnes’ pitches for every start, and we see how much it varies in 2019. The 2019 line reaches a higher peak than any other year, and its low is still over .250, which indicates that opponents had great success getting hits off Burnes in 2019. From 2021 to 2023 the lines are tighter and sit lower than in 2019, indicating that Burnes had much more success preventing hits in those years.

Further, we examine the wOBA graph. wOBA is similar to batting average, but it accounts for how a batter reaches base, including walks and extra-base hits. Like the estimated batting average graph, the estimated wOBA graph is much less clean for Burnes in 2019. His estimated wOBA from 2020 to 2023 has less variance, with his 2021 and 2022 seasons in particular having tighter distributions.

Lastly, we see from Burnes’ spin rate graphs that his spin rate in 2019 varies widely. A logical explanation connects this to his effective speed also having high variance: Burnes had a more varied pitch mix in 2019, and using different types of pitches more often produced many different speeds and spin rates. It is well-documented that Burnes switched to using his cutter far more frequently in his seasons after 2019, which would explain the tighter distributions in the years that followed.

Graphs of Raw Stats from 2019-2023

Effective Speed:

We again see that the line for 2019 is more varied than those for 2020 to 2023, a result of Burnes using a more varied pitch mix in 2019. We can also infer that starting in 2020, when Burnes began using fewer pitch types and choosing the cutter more often, the line gets tighter and smoother with every year, most likely because Burnes became more comfortable with his pitch mix, specifically his cutter.

Estimated Batting Average:

Burnes’ line graph for his estimated batting average in 2019 sits at a level higher than the other years, the reasons for which we have discussed in the previous section. Interestingly, we can take a look at 2023 compared to 2021 and 2022 and infer that Burnes struggled in 2023 relative to how dominant he was in 2021 and 2022.

Estimated wOBA:

Again, these graphs support an inference similar to the one from estimated batting average: Burnes struggled mightily in 2019 and found his footing from 2020 onward, but in 2023 he struggled relative to the new standard he had set.

Spin Rate:

We can get a closer look at the spin rates per game for Burnes in our five-year span, seeing that in 2019, his spin rate took a steep drop in the middle of the season, likely a result of Burnes experimenting with his pitch mix. We see that Burnes’ spin rate slowly increased over the course of the COVID-19 shortened season in 2020. 2021 had Burnes roughly emulating an upside-down bell curve with his spin rate, and he has comparatively cleaner average spin rates in 2022 and 2023.

Diving Deeper: Burnes’ Good Pitches vs. Bad Pitches

We filter Burnes’ pitches into “good” pitches and “bad” pitches. We define a “good” pitch as one with a change in run expectancy below 0, and a “bad” pitch as one with a change in run expectancy above 0. Essentially, a bad pitch raises the number of runs the batting team is expected to score, while a good pitch lowers it.
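A minimal sketch of such a labeling function, assuming the change in run expectancy is stored as Statcast’s `delta_run_exp`. The function name `classify_pitch` and the sample values are hypothetical, and the “neutral” label for a change of exactly 0 is an assumption, since the definition above only covers values below and above 0.

```r
# Label each pitch by the sign of its change in run expectancy.
# Negative -> "good" (expected runs went down), positive -> "bad".
# A change of exactly 0 is labeled "neutral" here; that choice is an
# assumption, not something the report specifies.
classify_pitch <- function(delta_run_exp) {
  ifelse(is.na(delta_run_exp), NA_character_,
         ifelse(delta_run_exp < 0, "good",
                ifelse(delta_run_exp > 0, "bad", "neutral")))
}

delta <- c(-0.25, 0.10, 0, NA, -0.02)  # made-up values
labels <- classify_pitch(delta)

# which() drops the NA comparison, so missing values are excluded.
good_deltas <- delta[which(labels == "good")]
bad_deltas  <- delta[which(labels == "bad")]

print(labels)
```

Keeping missing values as `NA` rather than forcing them into either group avoids counting pitches with no recorded run-expectancy change as good or bad.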

We want to portray how high Burnes’ highs are as a pitcher and how low his lows are here. We know that Burnes achieved great success post-2019 season, but we want to see how his best and worst starts (with every pitch) stack up to each other.

Burnes’ Average Spin Rate, Estimated wOBA, Estimated Batting Average, and Effective Speed Per Year

Comparing the tables, we can highlight that his 2019 season holds the extremes of estimated wOBA in both tables: his .098 wOBA is the lowest mark among good pitches, and his .745 wOBA is the highest among bad pitches. We can also note that while these tables may portray Burnes as having similar seasons for five straight years, the 2019 season features far more “bad” pitches than “good” ones, and our pitch system only summarizes him on a per-start basis within the good and bad groups.

Graph of Burnes’ Good Pitches from Average Stats

These graphs illustrate Burnes’ average metrics from only his good pitches. His effective speed graph for 2019 shows how much his good pitches varied, once again owing to his more varied pitch mix. Once Burnes fell in love with his cutter after this season, the effective speed graphs from 2020-2023 become much flatter, indicating that his good pitches had similar speeds of around 90 MPH, which points to some type of fastball being thrown as a good pitch (his cutter).

From the estimated batting average graph, we can see Burnes’ inconsistency during his 2019 season, with even good pitches resulting in an estimated batting average close to .300, a poor result for pitches we consider good. In his 2021-2023 seasons, the batting average graph has less variance and hovers around .200, which indicates how effective his good pitches were. For wOBA, it’s a similar story with 2021-2023 being much cleaner, but we can see that 2020 was also up-and-down like 2019, albeit with a smaller sample size.

The spin rate for 2019 peaks at a level much higher than the other years, likely because Burnes chose to use more breaking pitches (more spin). We can see how Burnes commits more to his good pitches having less spin in 2022 and 2023.

Graphs of Good Pitches from Raw Stats

Effective Speed:

Across every pitch Burnes throws over our five-year interval, we can again see how the speed of his good pitches varied heavily in 2019, varied a little across a small sample size in 2020, and straightened out from 2021-2023.

Estimated Batting Average:

We can see that Burnes’ best pitches from 2019 and 2020 brought him the best results across our five-year interval, but they also brought him the worst results from what we would consider to be good pitches. This speaks to the overall inconsistency Burnes experienced in his career prior to finding a groove in 2021, something he’d carry on in 2022 and beyond.

Estimated wOBA:

We can once again point out that Burnes’ good pitches resulted in much tighter graphs for 2021-2023 as he figured out his pitching.

Spin Rate:

What is fascinating to note here is that we are tracking Burnes’ pitches across entire seasons. With this, we can see in 2019 how his good pitches start as being ones with high spin rates, then by the middle of the season, his best pitches have less spin, possibly indicating how he has success more so with pitches that break less (hint, it’s his cutter). We see this trend again in 2021, with tighter graphs in 2022 and 2023.

Graph of Burnes’ Bad Pitches from Avg Stats

Looking at the graphs of Burnes’ bad pitches from his average metrics, we see little difference in average effective speed per start between the good pitches graph and the bad pitches graph, indicating that he succeeded and failed with pitches of all speeds.

Where we see a noticeable spike is in average estimated batting average for 2019 and 2020, where pitches deemed “bad” resulted in estimated batting averages of about .700 and above. These are extremely high averages, consistent with poor pitches, so it comes as no surprise that our function for pitch quality found these pitches to have positive changes in run expectancy. We see spikes and variance once again in wOBA for the 2019 season, and cleaner, tighter distributions for 2022 and 2023 in particular, the sign of a pitcher who has grown increasingly comfortable with his repertoire.

From the spin rate graph, we can note that Burnes got poor results from bad pitches of all spin rates, especially in 2019. This tells us that Burnes struggled with basically all of his pitches in 2019 due to a more varied pitch mix. The rest of the seasons, especially 2021-2023, tell us that Burnes made his bad pitches with a more selective pitch mix, likely his cutter. Comparing this to the good pitches graph, we can infer that when Burnes does not spin his cutter enough, his pitches have a poor (positive) change in run expectancy, and when he gets a good amount of spin, the change in run expectancy is below zero.

Graphs of Burnes’ Bad Pitches from Raw Stats

Effective Speed:

The graph for effective speed indicates that the effective speed of Corbin Burnes’ bad pitches varied significantly more in 2019 than it did in the following four years. At the beginning and end of the 2019 season, Burnes consistently threw bad pitches at a high speed. In the middle of the 2019 season, the effective speed of his bad pitches dropped to approximately 85 MPH. This drop might imply that Burnes was struggling to adjust to MLB early in his career. As a result, he attempted to change strategies and throw different types of pitches at a lower speed. Unfortunately, this adjustment still resulted in bad pitches. In the 2020 season, Burnes pitched in far fewer games than normal because COVID-19 shortened the season to 60 games. From 2021 to 2023, Burnes’ bad pitches stayed relatively fast, with minor variations.

Estimated Batting Average:

Opponents’ estimated batting average against Corbin Burnes peaked at the beginning of the 2019 season and at the end of the 2023 season. Since a low estimated batting average indicates good performance from the pitcher, these peaks were times when Burnes was struggling. At the start of the 2021 season, the estimated batting average on Burnes’ bad pitches was significantly lower than in previous years, indicating that even his bad pitches were at their best. For the rest of the 2021 season, it fluctuated and slowly but steadily increased. By the end of the 2021 season, the estimated batting average on his bad pitches had increased to 0.62, indicating that his opponents were performing better against him. It remained relatively consistent until the end of the 2023 season, when it increased significantly to almost 0.8, indicating that Burnes was performing extremely poorly on those pitches.

wOBA:

Similar to estimated batting average (EBA), opponents’ weighted on-base average (wOBA) against Corbin Burnes was significantly higher at the beginning of 2019 and at the end of 2023. Because wOBA weights each way of reaching base by its value, it is a more informative measure than EBA. We can see a significant drop in wOBA for Burnes’ bad pitches at the end of 2020 and in the middle of 2021. This graph further supports our analysis of the EBA graph.

Spin Rate:

The spin rate of Corbin Burnes’ bad pitches dropped significantly after the beginning of the 2019 season. In 2020, Burnes started the season with a lower spin rate and consistently ramped it up throughout the season. During the 2021 season, he started and finished with a similar spin rate, but used less spin in the middle of the season. In both 2022 and 2023, Burnes started the season with a relatively low spin rate and steadily increased it throughout the season.

Observing Burnes’ Pitch Type Distribution

An integral part of a pitcher’s success is the variety of pitches in their arsenal. Burnes provides a great example of how a pitcher can greatly improve after honing a specific pitch. As we have displayed earlier, Burnes has greatly improved since his second season (2019), largely because he was able to develop a cutter. We’ll explore how his pitch distribution changed over time.

Over this five-year span, we see a drastic change in Burnes’ game plan. While his fastball initially had the lion’s share of his pitches, he quickly transitioned toward developing a cutter. By 2021, he rarely threw any fastballs. He also relies noticeably less on his slider. In 2023, his pitch distribution was 55.5% cutters, 17.4% curveballs, 11.1% changeups, 8.3% sliders, 7.7% sinkers, and 0% fastballs.

From here, we can look at which types of pitches have tended to yield a negative delta run expectancy (good pitches) and which a positive delta run expectancy (bad pitches).

Above, we can see that the distribution of pitch types among good pitches resembles the overall distribution very closely. This is because, as shown above, the majority of his pitches are considered “good”. The largest notable change is that cutters make up 55% of his overall pitches but 60% of his good pitches in 2023.

There are some noticeable differences between the bad pitch distribution and the overall pitch distribution. While cutters constitute 55% of his overall pitches and 60% of his good pitches, they make up only 50% of his bad pitches. Looking solely at the raw counts for 2023, a cutter thrown by Burnes was 1.56 times as likely to decrease the opposing batter’s chances of scoring as to increase them.
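The two kinds of comparison used here, shares within each group and a ratio of raw counts, can be sketched in base R. The data below are made up and only illustrate the computation; `pitch_name` and `delta_run_exp` follow the Statcast column names, while `quality` is a hypothetical helper column. Note that a ratio like the 1.56 figure comes from counts of good and bad cutters, not from dividing the two percentage shares.

```r
# Made-up single-season sample: pitch type and change in run expectancy.
pitches <- data.frame(
  pitch_name = c("Cutter", "Cutter", "Cutter", "Cutter", "Cutter",
                 "Curveball", "Curveball", "Changeup", "Slider", "Sinker"),
  delta_run_exp = c(-0.05, -0.10, -0.02, 0.30, 0.04,
                    -0.08, 0.12, -0.03, 0.06, -0.01)
)
pitches$quality <- ifelse(pitches$delta_run_exp < 0, "good", "bad")

# Share of each pitch type among all pitches.
overall_share <- prop.table(table(pitches$pitch_name))

# Share of each pitch type within good and within bad pitches
# (margin = 2 makes each column sum to 1).
counts <- table(pitches$pitch_name, pitches$quality)
share_by_quality <- prop.table(counts, margin = 2)

# Good-to-bad ratio from raw counts, per pitch type.
# Types with no bad pitches in the sample give Inf.
good_to_bad <- counts[, "good"] / counts[, "bad"]

print(round(overall_share, 3))
print(round(share_by_quality, 3))
print(good_to_bad)
```

In this toy sample the cutter has three good pitches and two bad ones, so its count ratio is 3 / 2, even though it makes up the same share of the good and bad groups.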

Correlation Heatmaps by Pitch

The following graphs are heatmaps, designed to show how correlated the variables are to each other. We define these variables as such:

release_speed, speed of pitch (mph) when released
effective_speed, perceived speed of pitch (mph), adjusted for the pitcher’s release extension
release_pos_x, x-coordinate of release point of pitch
release_pos_y, y-coordinate of release point of pitch
release_pos_z, z-coordinate of release point of pitch
pfx_x, horizontal movement of pitch
pfx_z, vertical movement of pitch
vx0, x-component of pitch velocity
vy0, y-component of pitch velocity
vz0, z-component of pitch velocity
ax, x-component of pitch acceleration
ay, y-component of pitch acceleration
az, z-component of pitch acceleration
delta_run_exp, change in run expectancy caused by the pitch

On the heatmap, boxes shaded red depict a positive correlation between two variables, and boxes shaded blue depict a negative correlation. We create a heatmap for each pitch Burnes throws, and we take a look at each variable’s relationship with our overall metric of run expectancy. We will look at Burnes’ five primary pitches. Fastballs are notably absent, as they constitute 0-1% of his pitches after 2019.
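Each heatmap is a picture of a correlation matrix, which can be sketched in base R with `cor()`. The example below is an illustration under stated assumptions: the data are simulated rather than Burnes’ pitches, `make_pitches` is a hypothetical helper, and only a handful of the variables listed above are included. In Statcast’s coordinate system the pitch travels in the negative y direction, so the simulated `vy0` is built as a negative multiple of `release_speed`, which makes those two variables correlate strongly and negatively by construction.

```r
set.seed(1)  # reproducible simulated data, not Burnes' actual pitches

# Hypothetical helper that simulates n pitches of one type.
make_pitches <- function(n, pitch_name, speed_mean) {
  release_speed <- rnorm(n, mean = speed_mean, sd = 1.2)
  pfx_x <- rnorm(n, mean = 0.3, sd = 0.15)
  data.frame(
    pitch_name = pitch_name,
    release_speed = release_speed,
    vy0 = -1.45 * release_speed + rnorm(n, sd = 0.3),  # faster pitch, more negative vy0
    pfx_x = pfx_x,
    ax = 12 * pfx_x + rnorm(n, sd = 0.3),              # movement tracks acceleration
    delta_run_exp = rnorm(n, sd = 0.2)                  # unrelated noise in this toy
  )
}

pitches <- rbind(make_pitches(200, "Cutter", 95),
                 make_pitches(200, "Curveball", 81))

vars <- c("release_speed", "vy0", "pfx_x", "ax", "delta_run_exp")

# One correlation matrix per pitch type; each matrix is what a heatmap colors.
corr_by_pitch <- lapply(split(pitches[vars], pitches$pitch_name),
                        cor, use = "pairwise.complete.obs")

# The row relating every variable to the change in run expectancy for cutters.
print(round(corr_by_pitch[["Cutter"]]["delta_run_exp", ], 2))
```

Each cell is a correlation coefficient between two variables. Only the `delta_run_exp` row (or column) speaks directly to run expectancy; cells such as `pfx_x` against `ax` describe how two pitch characteristics move together.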

Cutters:

We can pinpoint the hottest and coldest zones in this heatmap for Burnes’ cutters. Release speed and the y-component of pitch velocity are negatively correlated, which we read as a combination against which hitters tend to do worse. Of note, effective speed is also negatively correlated with the y-component of pitch velocity, as is the z-component of pitch acceleration. These two pairs have correlation coefficients of -0.47 and -0.46, respectively. The cutter is known as Burnes’ most effective pitch and is also a variation of a fastball, which can explain some of these relationships.

In terms of positive correlations, there is one between horizontal movement of the pitch and the x-component of pitch acceleration and one between vertical movement of the pitch and the z-component of pitch acceleration. Both pairs have a correlation coefficient of +0.99, which we read as combinations that go along with Burnes allowing runs and struggling. These variables are linked through their axes: movement along the horizontal or vertical axis tracks acceleration along that same axis. Because these coefficients describe how two pitch characteristics move together rather than a change in run expectancy itself, any link to runs allowed is an inference.

Changeup:

Moving on to the correlation heatmap for Burnes’ changeups, we see notable positive correlations between release speed and effective speed, and between horizontal movement of the pitch and the x-component of pitch acceleration. These two pairs have correlation coefficients of almost +1, which we read as relationships that do not bode well for Burnes’ changeups. On the other hand, there are negative correlations between release speed and the y-component of pitch velocity and between effective speed and the y-component of pitch velocity. Since the changeup relies on deceiving the hitter about its speed, it makes sense that these two relationships can cause hitters to struggle against Burnes’ changeups.

Curveballs:

For Burnes’ curveballs, there are positive correlations between horizontal movement of the pitch and the x-component of pitch acceleration, and between vertical movement of the pitch and the z-component of pitch acceleration, with coefficients of nearly +1, which we read as variables that can lead hitters to have success against Burnes’ curveballs. It would make sense that these variables matter to the change in run expectancy, considering that the curveball is reliant on its break and location. Curveballs usually have a lot of vertical or horizontal movement, and it appears that hitters have success against Burnes in this regard. Connecting this to our prior “bad” pitches graph, these curveballs may be the high-spin pitches in that graph, especially since curveballs have a lot of spin on them.

Burnes sees success in the relationships between the y-component of pitch velocity and release speed, between effective speed and the y-component of pitch velocity, and between the z-component of pitch velocity and release speed, with correlation coefficients of -1, -0.61, and -0.55, respectively. Because curveballs are slower pitches, it would make sense that Burnes sees success from the speed aspect of these pitches, since they may catch hitters off-guard.

Sinkers:

Moving on to Burnes’ sinkers, we note positive correlations between the horizontal movement of the pitch and the x-component of pitch acceleration, between the vertical movement of the pitch and the z-component of pitch acceleration, and between release speed and effective speed, which we read as areas where Burnes struggles. One explanation could be that Burnes does not get enough “sink”, or break, on his sinkers, causing hitters to have success against them, and that the pitch may lack enough velocity to be successful. On the other hand, the y-component of pitch velocity is negatively correlated with both effective speed and release speed, which we read as an area of success for Burnes, once again implying that where he locates his sinker may matter for him to be successful.

Sliders:

For Burnes’ heatmap for sliders, there are many “hot” and “cold” zones to examine. Skimming through the graph, we can see that Burnes appears most successful at preventing runs when he pairs his speeds well with where he locates his slider. These pairs have correlation coefficients between -0.45 and -1, while horizontal movement of the pitch paired with each of the two speeds has correlation coefficients of -0.63 and -0.59. For the latter two relationships, this makes a lot of sense considering that the slider must have horizontal break to be effective. The positive correlation between effective speed and release speed is one we read as a struggle for Burnes, which tracks because the slider depends less on velocity than on break and movement. We also see that the variables that deal with the break of the pitch (horizontal and vertical) have positive relationships with the variables that deal with the acceleration of the pitch.

Conclusion

Through this report, we sought to analyze MLB pitcher Corbin Burnes and his pitches. By examining Burnes through the lens of the advanced metrics of effective speed, spin rate, estimated batting average, and estimated wOBA, we got a surface-level look at how he became a completely different pitcher after struggling in the 2019 season. By making a function that deemed each of his pitches either “good” or “bad” based on how it changed the run expectancy, we were able to use these metrics to see how he changed his pitches and how each pitch in every season fared.

We supplemented the findings from our filtered function by looking at how Burnes’ pitch distribution changed over our five-year span, showing us how he developed a cutter and rode it to success in 2020 and beyond. This information goes along with our findings from analyzing his good and bad pitches, because it gives us implications that we can confirm and connect to what we know. For example, Burnes’ “good” pitches in the later years of the interval were ones with a lower spin rate. That tells us that those pitches were likely his cutter and not his breaking pitches, which appeared to be regulars on his “bad” pitches graph.

Furthermore, we created heatmaps to display how the variables relate to one another, giving us an idea of which variable relationships go along with positive or negative changes in run expectancy for each type of pitch Burnes throws. This allows us to analyze each pitch and use the context of the pitch itself to draw conclusions about why the pitch is or is not successful for Burnes. Ultimately, we were able to illustrate and explain Corbin Burnes’ career within our five-year interval through not only our created function, but also the graphs built from the metrics and variables available to us.

Code Appendix